Distant viewing: analyzing large visual corpora
Similar resources
Analyzing Word Frequencies in Large Text Corpora Using Inter-arrival Times and Bootstrapping
Comparing frequency counts over texts or corpora is an important task in many applications and scientific disciplines. Given a text corpus, we want to test a hypothesis, such as “word X is frequent”, “word X has become more frequent over time”, or “word X is more frequent in male than in female speech”. For this purpose we need a null model of word frequencies. The commonly used bag-of-words mo...
Relevant Expressions in Large Corpora
The automatic extraction of statistically relevant expressions of any language from raw (non-annotated) corpora is a very useful task, especially when it may be used for the study of data from old texts, given the unavailability of informants, the large number of graphic variants found in such texts, and the scarcity of annotated texts. Furthermore, statistically based extraction of compl...
Representing and Maintaining Large Corpora
The spectrum of electronic corpora which are analysed within the field of corpus linguistics has increased remarkably since the publication of the Brown Corpus (Kucera and Francis, 1967). High efforts of digitizing print media or transcription of speech as well as the limitations of processing speed and storage space made early corpora considerably small in size. With increasing computer perfor...
Bootstrapping Large Sense Tagged Corpora
The performance of Word Sense Disambiguation systems largely depends on the availability of sense tagged corpora. Since the semantic annotations are usually done by humans, the size of such corpora is limited to a handful of tagged texts. This paper proposes a generation algorithm that may be used to automatically create large sense tagged corpora. The approach is evaluated through comparative ...
Journal
Journal title: Digital Scholarship in the Humanities
Year: 2019
ISSN: 2055-7671,2055-768X
DOI: 10.1093/llc/fqz013